Speaker-independent word recognition by less cost and stochastic dynamic time warping method
نویسندگان
چکیده
In this paper, we describe some considerations on a speaker-independent word recognition method on a large vocabulary size by the concatenation of syllable templates and a stochastic dynamic time warping method, where syllable templates are taken from spoken words. We got the reference patterns from 216 words uttered by 30 male speakers and recognized the other 200 words uttered by the other 10 speakers. The standard dynamic time warping method for speakerindependent recognition on 200 words gave the average word recognition rate of 89.3%. The stochastic dynamic time warping method we proposed here improved the recognition rate to 92.9%.
منابع مشابه
تخمین سریع ضرایب پیچش در هنجارسازی طول مجرای صوتی با استفاده از امتیاز به دست آمده از مدلسازی تشخیص جنسیت
The performance of automatic speech recognition (ASR) systems is adversely affected by the variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variations in the speakers is the differences between their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...
متن کاملBuilding Sequence Kernels for Speaker Verification and Word Recognition
This chapter describes the adaptation and application of kernel methods for speech processing. It is divided into two sections dealing with speaker verification and isolated-word speech recognition applications. Significant advances in kernel methods have been realised in the field of speaker verification, particularly relating to the direct scoring of variable-length speech utterances by seque...
متن کاملDevelopment of Isolated Word Speech Recognition System
The isolated word speech recognition system based on dynamic time warping (DTW) has been developed. Speaker adaptation is performed using speaker recognition techniques. Vector quantization is used to create reference templates for speaker recognition. Linear predictive coding (LPC) parameters are used as features for recognition. Performance is evaluated using 12 words of Lithuanian language p...
متن کاملSpeaker normalized acoustic modeling based on 3-D Viterbi decoding
This paper describes a novel method for speaker normalization based on a frequency warping approach to reduce variations due to speaker-induced factors such as the vocal tract length. In our approach, a speaker normalized acoustic model is trained using time-varying (i.e., state, phoneme or word dependent) warping factors, while in the conventional approaches, the frequency warping factor is xe...
متن کاملSpeaker independent recognition of isolated words using clustering techniques
A speaker-independent isolated word recognition system is described which is based on the use of multiple templates for each word in the vocabulary. The word templates are obtained from a statistical clustering analysis of a large database consisting of 100 replications of each word (i.e., once by each of 100 bIkers). The recognition system, which accepts telephone quality speech input, is base...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1987